Independence structure


Cooperative neural networks (CoNN): Exploiting prior independence structure for improved classification

Neural Information Processing Systems

We propose a new approach, called cooperative neural networks (CoNN), which uses a set of cooperatively trained neural networks to capture latent representations that exploit a given prior independence structure. The model is more flexible than traditional graphical models based on exponential family distributions, but incorporates more domain-specific prior structure than traditional deep networks or variational autoencoders. The framework is very general and can be used to exploit the independence structure of any graphical model. We illustrate the technique by showing that the independence structure of the popular Latent Dirichlet Allocation (LDA) model can be transferred to a cooperative neural network, CoNN-sLDA. Empirical evaluation of CoNN-sLDA on supervised text classification tasks demonstrates that the theoretical advantages of prior independence structure can be realized in practice: we achieve a 23 percent reduction in error on the challenging MultiSent data set compared to the state of the art.
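As a rough illustration of the cooperative training idea, the sketch below (in PyTorch) couples two small networks that repeatedly refine each other's latent states, one for the document and one for its words, loosely mirroring LDA's coupled variational updates. All module names, dimensions, and the update rule are illustrative assumptions, not the authors' implementation.

```python
# Minimal, hypothetical sketch of cooperatively trained networks for
# supervised text classification. Not the CoNN-sLDA architecture itself.
import torch
import torch.nn as nn

class CoNNSLDASketch(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, n_classes=2, n_steps=3):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, embed_dim)
        # One network refines the document state from the word states...
        self.doc_net = nn.Linear(embed_dim, embed_dim)
        # ...the other refines each word state given the document state.
        self.word_net = nn.Linear(2 * embed_dim, embed_dim)
        self.classifier = nn.Linear(embed_dim, n_classes)
        self.n_steps = n_steps

    def forward(self, doc_tokens):        # doc_tokens: (batch, doc_len) token ids
        w = self.word_emb(doc_tokens)     # per-word latent states
        for _ in range(self.n_steps):     # cooperative, alternating updates
            d = torch.tanh(self.doc_net(w.mean(dim=1)))          # words -> document
            d_tiled = d.unsqueeze(1).expand_as(w)                # document -> words
            w = torch.tanh(self.word_net(torch.cat([w, d_tiled], dim=-1)))
        return self.classifier(d)         # supervised head (the "s" in sLDA)
```

In this sketch, training both networks jointly on the classification loss is what makes them cooperate: neither network is fit in isolation, and the alternating updates encode the assumed document/word dependence pattern into the learned representations.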







Uncovering Meanings of Embeddings via Partial Orthogonality

Neural Information Processing Systems

Machine learning tools often rely on embedding text as vectors of real numbers. In this paper, we study how the semantic structure of language is encoded in the algebraic structure of such embeddings.



Conditional Independence Estimates for the Generalized Nonparanormal

Shah, Ujas, Lladser, Manuel, Morrison, Rebecca

arXiv.org Machine Learning

For general non-Gaussian distributions, the covariance and precision matrices do not encode the independence structure of the variables, as they do for the multivariate Gaussian. This paper builds on previous work to show that for a class of non-Gaussian distributions -- those derived from diagonal transformations of a Gaussian -- information about the conditional independence structure can still be inferred from the precision matrix, provided the data meet certain criteria analogous to the Gaussian case. We call such transformations of the Gaussian the generalized nonparanormal. The functions that define these transformations are, in a broad sense, arbitrary. We also provide a simple and computationally efficient algorithm that leverages this theory to recover conditional independence structure from generalized nonparanormal data. The effectiveness of the proposed algorithm is demonstrated via synthetic experiments and applications to real-world data.
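The abstract's recipe can be pictured with a short sketch: Gaussianize each marginal with a rank-based transform, estimate the precision matrix, and read conditional independences off its (near-)zero entries. This follows the standard nonparanormal approach the paper builds on; the estimator and threshold below are illustrative assumptions, not the authors' exact algorithm.

```python
# Hedged sketch: recover a conditional-independence graph from data assumed
# to be a diagonal (coordinatewise) transformation of a Gaussian.
import numpy as np
from scipy.stats import norm

def gaussianize_marginals(X):
    """Map each column to approximate standard-normal scores via ranks."""
    n, p = X.shape
    Z = np.empty_like(X, dtype=float)
    for j in range(p):
        ranks = X[:, j].argsort().argsort() + 1   # 1..n ranks (ties broken arbitrarily)
        Z[:, j] = norm.ppf(ranks / (n + 1))       # inverse Gaussian CDF transform
    return Z

def independence_graph(X, threshold=0.1):
    """Edges = variable pairs whose partial correlation exceeds the threshold."""
    Z = gaussianize_marginals(X)
    precision = np.linalg.pinv(np.cov(Z, rowvar=False))
    d = np.sqrt(np.diag(precision))
    partial_corr = -precision / np.outer(d, d)    # standardize off-diagonal entries
    np.fill_diagonal(partial_corr, 1.0)
    return np.abs(partial_corr) > threshold       # boolean adjacency matrix
```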


Reviews: Cooperative neural networks (CoNN): Exploiting prior independence structure for improved classification

Neural Information Processing Systems

Summary: The authors propose a new method that combines a latent Dirichlet allocation (LDA) model with a neural network architecture for supervised text classification, yielding a model that can be trained end-to-end. In particular, they use a network structure to approximate the intractable inference equations that minimize the KL divergence between the LDA posterior and an approximation based on marginal distributions. The authors show that an embedding in a Hilbert space allows the inference equations to be approximated, and they choose neural networks to parametrize the functional mapping. Finally, on two applications, the authors demonstrate an incremental advancement over previous models. Clarity: The overall writing is good, especially as it is a very technical paper with many mathematical details.


Uncovering Meanings of Embeddings via Partial Orthogonality

Jiang, Yibo, Aragam, Bryon, Veitch, Victor

arXiv.org Machine Learning

Machine learning tools often rely on embedding text as vectors of real numbers. In this paper, we study how the semantic structure of language is encoded in the algebraic structure of such embeddings. Specifically, we look at a notion of "semantic independence" capturing the idea that, e.g., "eggplant" and "tomato" are independent given "vegetable". Although such examples are intuitive, it is difficult to formalize such a notion of semantic independence. The key observation here is that any sensible formalization should obey a set of so-called independence axioms, and thus any algebraic encoding of this structure should also obey these axioms. This leads us naturally to use partial orthogonality as the relevant algebraic structure. We develop theory and methods that allow us to demonstrate that partial orthogonality does indeed capture semantic independence. Complementary to this, we also introduce the concept of independence-preserving embeddings, where embeddings preserve the conditional independence structures of a distribution, and we prove the existence of such embeddings and approximations to them.
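To make the notion of partial orthogonality concrete, here is a small sketch: two embedding vectors are treated as partially orthogonal given a conditioning set if their residuals are orthogonal after projecting out the span of the conditioning vectors. The function name, tolerance, and toy vectors are illustrative assumptions, not the paper's definitions or data.

```python
# Illustrative partial-orthogonality check for embedding vectors.
import numpy as np

def partially_orthogonal(u, v, conditioning, tol=1e-6):
    """True if u and v are orthogonal after removing the span of `conditioning`.

    `conditioning` is a (k, d) array whose rows span the conditioning subspace.
    """
    C = np.atleast_2d(conditioning)
    Q, _ = np.linalg.qr(C.T)           # orthonormal basis of the subspace
    residual_u = u - Q @ (Q.T @ u)     # project out the conditioning subspace
    residual_v = v - Q @ (Q.T @ v)
    return abs(residual_u @ residual_v) <= tol

# Toy usage with made-up 3-d "embeddings":
veg = np.array([1.0, 0.0, 0.0])
eggplant = np.array([1.0, 1.0, 0.0])
tomato = np.array([1.0, 0.0, 1.0])
print(partially_orthogonal(eggplant, tomato, veg))   # True: residuals are e2 and e3
```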